R version: > 4.1.1
OS: macOS > 10.10, Windows 7-11, or Linux with a graphical interface
# clone this repo to your machine
git clone git@github.com:ShawnWx2019/WGCNA-shinyApp.git WGCNAshiny
cd WGCNAshiny
## Method 1.
Rscript WGCNAbyClick.v1.R
## Method 2. Open WGCNAbyClick.v1.R in RStudio or another IDE you prefer and run the script.
You can get TBtools from CJ-Chen/TBtools
If you’ve utilized TBtools for your analysis, please cite:
Chen C, Chen H, Zhang Y, Thomas HR, Frank MH, He Y, Xia R. TBtools: An Integrative Toolkit Developed for Interactive Analyses of Big Biological Data. Mol Plant. 2020 Aug 3;13(8):1194-1202. doi: 10.1016/j.molp.2020.06.009.
Follow these steps to install RServer.plugin and WGCNA shiny.plugin.
You can prepare your datExpr file following the Expression matrix file.
Data source:
- transcriptomics
  - read count
  - expected count
  - normalized read count (FPKM, RPKM, TPM, CPM)
  - microarray data
- metabolomics
- proteomics
  - protein abundance
  - corrected intensity
- …
Format:
- Gene/metabolite/protein IDs in rows and sample IDs in columns.
- Sample IDs should not contain spaces or special symbols (- @ * & # etc.) and should not start with a number.
- DO NOT use pure numbers as gene/metabolite/protein IDs.
- Only tab-delimited files such as .txt or .tsv are accepted. DO NOT use .csv, .xlsx, or .xls.
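The format rules above can be checked before uploading a file. The following is a minimal Python sketch, not part of the app: the function names are hypothetical, and the exact character set the app accepts is an assumption (here letters, digits, and underscores only).

```python
import re

# Assumption: a valid sample ID uses only letters, digits, and underscores,
# and does not start with a digit. The app itself may be more or less strict.
SAMPLE_ID_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_sample_id(sample_id):
    """True if the ID has no spaces or special symbols and no leading digit."""
    return SAMPLE_ID_PATTERN.fullmatch(sample_id) is not None


def is_valid_feature_id(feature_id):
    """True unless the gene/metabolite/protein ID is empty or digits only."""
    return feature_id != "" and not feature_id.isdigit()


def check_header(header_line):
    """Return the sample IDs in a tab-delimited header line that break the rules."""
    columns = header_line.rstrip("\n").split("\t")
    sample_ids = columns[1:]  # the first column holds the feature ID
    return [s for s in sample_ids if not is_valid_sample_id(s)]
```

Calling `check_header` on the first line of the expression matrix returns the offending sample IDs, so an empty list means the header passes these checks.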
You can prepare your trait file following the trait data file. The data can be quantitative or qualitative traits.
For qualitative:
format1.
| sample_id | type |
|---|---|
| S_0001 | treat |
| S_0002 | treat |
| S_0003 | treat |
| S_0004 | control |
| S_0005 | control |
| S_0006 | control |
If the input trait data has only 2 columns, format1 is automatically transformed to format2.
format2.
| sample_id | treat | control |
|---|---|---|
| S_0001 | 1 | 0 |
| S_0002 | 1 | 0 |
| S_0003 | 1 | 0 |
| S_0004 | 0 | 1 |
| S_0005 | 0 | 1 |
| S_0006 | 0 | 1 |
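The conversion from format1 to format2 amounts to one indicator column per trait level. A minimal Python sketch of that idea follows; `one_hot_traits` is a hypothetical helper for illustration, and the app's own R implementation may order columns differently.

```python
def one_hot_traits(rows):
    """Convert (sample_id, type) pairs into one 0/1 column per type.

    Columns are ordered by the first appearance of each type.
    """
    levels = []
    for _, level in rows:
        if level not in levels:
            levels.append(level)
    header = ["sample_id"] + levels
    table = [
        [sample_id] + [1 if level == name else 0 for name in levels]
        for sample_id, level in rows
    ]
    return header, table


# Two of the samples from the format1 table above
header, table = one_hot_traits([("S_0001", "treat"), ("S_0004", "control")])
```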
For quantitative:
| sample_id | plant_height | yield |
|---|---|---|
| S_0001 | 12 | 1920 |
| S_0002 | 14 | 1930 |
| S_0003 | 13 | 1919 |
| S_0004 | 20 | 3020 |
| S_0005 | 25 | 3021 |
| S_0006 | 22 | 3320 |
According to the WGCNA FAQ, expression data must be cleaned before it can be used for WGCNA analysis. The data are processed as follows:
Read count values are transformed with the vst function from DESeq2. For already normalized values, such as FPKM, RPKM, or TPM, we can use the original values or transform them using log10(x+1).
“We then recommend a variance-stabilizing transformation. For example, package DESeq2 implements the function varianceStabilizingTransformation which we have found useful, but one could also start with normalized counts (or RPKM/FPKM data) and log-transform them using log2(x+1). For highly expressed features, the differences between full variance stabilization and a simple log transformation are small.”
“We suggest removing features whose counts are consistently low (for example, removing all features that have a count of less than say 10 in more than 90% of the samples) because such low-expressed features tend to reflect noise and correlations based on counts that are mostly zero aren’t really meaningful. The actual thresholds should be based on experimental design, sequencing depth and sample counts.”
“Probesets or genes may be filtered by mean expression or variance (or their robust analogs such as median and median absolute deviation, MAD) since low-expressed or non-varying genes usually represent noise. Whether it is better to filter by mean expression or variance is a matter of debate; both have advantages and disadvantages, but more importantly, they tend to filter out similar sets of genes since mean and variance are usually related.”
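The log transformation and the low-count filter quoted above can be sketched in a few lines of Python. This is an illustration only, not the app's R code: the function names are hypothetical, and the defaults (a count below 10 in more than 90% of samples) are taken from the FAQ example rather than from the app.

```python
import math


def log10_transform(values):
    """Apply log10(x + 1) to each value."""
    return [math.log10(v + 1) for v in values]


def is_consistently_low(values, cutoff=10, sample_fraction=0.9):
    """True if more than `sample_fraction` of the samples fall below `cutoff`."""
    n_low = sum(1 for v in values if v < cutoff)
    return n_low / len(values) > sample_fraction


def drop_low_features(matrix, cutoff=10, sample_fraction=0.9):
    """Keep features (dict of ID -> list of values) that are not consistently low."""
    return {
        feature_id: values
        for feature_id, values in matrix.items()
        if not is_consistently_low(values, cutoff, sample_fraction)
    }
```

As the FAQ notes, the actual thresholds should depend on experimental design, sequencing depth, and sample counts, so both defaults are meant to be overridden.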
Parameters:
- Format:
  - count: integer, read count from RNA-seq data.
  - expected count: float, expected count generated by RSEM.
  - normalized count: float, normalized expression levels, such as FPKM, RPKM, or TPM.
  - peak area: float, peak area of metabolites produced by LC-MS. Typically, we perform data cleaning and normalization on the peak area in the analysis results.
  - protein abundance: the results from software like PD might be protein abundance, while MaxQuant might provide corrected intensity. Both can be used. However, when the input is corrected intensity, the values are less than 1, so be mindful of the threshold selection in subsequent filtering.
- Normalized method:
  - raw: use the raw values.
  - logarithm: use log10(x+1).
- Sample percentage: 0-1, the fraction of samples in which the expression level falls below the cutoff.
- Expression Cutoff: numeric, the expression level used as the cutoff when counting low-expressed samples.
- Filter Method:
  - MAD: median absolute deviation.
  - SVR: variance.
- Reserved genes Num: integer, the number of genes to retain for WGCNA analysis after filtering. If this number exceeds the count of genes remaining after filtering, all the filtered genes are kept.
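The Filter Method and Reserved genes Num parameters together describe a rank-and-truncate step. The Python sketch below shows one way this could work, under stated assumptions: `top_features` is a hypothetical helper, features are ranked from most to least variable, and the MAD here is unscaled (R's `mad()` multiplies by 1.4826 by default, which does not change the ranking).

```python
import statistics


def mad(values):
    """Unscaled median absolute deviation."""
    center = statistics.median(values)
    return statistics.median([abs(v - center) for v in values])


def top_features(matrix, n_keep, method="MAD"):
    """Rank feature IDs by MAD or variance ("SVR") and keep the top `n_keep`.

    `matrix` maps feature ID -> list of expression values (at least two).
    If `n_keep` exceeds the number of features, all of them are returned.
    """
    if method == "MAD":
        score = mad
    elif method == "SVR":
        score = statistics.variance  # sample variance
    else:
        raise ValueError("method must be 'MAD' or 'SVR'")
    ranked = sorted(matrix, key=lambda fid: score(matrix[fid]), reverse=True)
    return ranked[:n_keep]
```

Slicing with `ranked[:n_keep]` reproduces the behaviour described for Reserved genes Num: asking for more genes than remain simply returns all of them.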
Step by step
When constructing a weighted co-expression network, we need to select an appropriate soft-thresholding power so that the network approximates a scale-free topology. A power is generally judged appropriate when the signed R^2 of the scale-free topology fit is close to 1 and the mean connectivity is close to 0. However, the power should not be too large, as a larger power may lead to a higher false positive rate. Therefore, we usually choose the first power that crosses the R^2 threshold line, with a mean connectivity closest to 0 at that power, as the soft-thresholding power for subsequent analysis.
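For reference, the quantities named above can be written out as follows. This is the standard WGCNA formulation for an unsigned network; whether the app builds an unsigned or signed network is not stated here, so treat the adjacency definition as an assumption. The symbols $x_i$ (expression profile of gene $i$), $\beta$ (soft-thresholding power), $k_i$ (connectivity), and $p(k)$ (frequency of connectivity $k$) are introduced for illustration.

```latex
% Adjacency between genes i and j (unsigned network), with power \beta
a_{ij} = \left| \operatorname{cor}(x_i, x_j) \right|^{\beta}

% Connectivity of gene i; the mean connectivity is the average of k_i over all genes
k_i = \sum_{j \neq i} a_{ij}

% Scale-free topology: connectivity follows a power law
p(k) \propto k^{-\gamma}

% Signed R^2: fit \log_{10} p(k) against \log_{10} k by linear regression,
% then attach the sign so that a negative slope gives a positive value
\text{signed } R^2 = -\operatorname{sign}(\text{slope}) \cdot R^2
```

Raising $\beta$ shrinks weak correlations toward 0 faster than strong ones, which is why the mean connectivity falls as the power grows.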
Parameters:
- R^2 cutoff: numeric, cutoff of the scale-free topology model fit.
- Power type: if the recommended power provided by the software is suitable, select ‘recommend’. If it is unsuitable, choose an appropriate power based on the two graphs on the right, change this option to ‘customized’, and select the power you want to use in ‘final power selection’.
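The selection rule described above (the first power whose signed R^2 crosses the cutoff) can be sketched in Python. This is an illustration, not the app's code: `recommend_power` is a hypothetical helper, and the numbers in the example table are made up to show the shape of the input.

```python
def recommend_power(fit_table, r2_cutoff):
    """Return the smallest power whose signed R^2 reaches the cutoff.

    `fit_table` holds (power, signed_r2, mean_connectivity) tuples.
    Returns None when no power reaches the cutoff, in which case the power
    has to be chosen by hand from the two diagnostic plots.
    """
    for power, signed_r2, _mean_connectivity in sorted(fit_table):
        if signed_r2 >= r2_cutoff:
            return power
    return None


# Illustrative values only, not real results
example_fit = [
    (1, 0.10, 900.0),
    (2, 0.45, 300.0),
    (4, 0.78, 60.0),
    (6, 0.86, 20.0),
    (8, 0.91, 8.0),
]
chosen = recommend_power(example_fit, r2_cutoff=0.85)
```

Taking the smallest qualifying power, rather than the one with the highest R^2, reflects the caution above that an overly large power may raise the false positive rate.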
Step by step
Jan 21 2023 V0.0.6.230121
🍿 + New options of input data format.
🍿 + Ceil expected count.
🍬 + Progress bar in module detection and module-trait step.
🐛 + Modified some inappropriate descriptions.
⭐️ + Outlier removal.
⭐️ + IterativeWGCNA.
🍀 + Export parameter.